Generic functional parallel algorithms: scan and FFT
Similar resources
Generic Parallel Algorithms
We develop a nature-inspired generic programming language for parallel algorithms, one that works for all data structures and control structures. We show that any parallel algorithm satisfying intuitively-appealing postulates can be modeled by a collection of cells, each of which is an abstract state machine, augmented with the ability to spawn new cells. The cells all run the same algorithm an...
FFT Algorithms for SIMD Parallel Processing Systems
SIMD (single instruction stream-multiple data stream) algorithms for one- and two-dimensional discrete Fourier transforms are presented. Parallel structurings of algorithms for efficient computation for a variety of machine size/problem size combinations are presented and analyzed. Through these algorithms, techniques for exploiting relationships between problem size and machine size are demonst...
Implementation and performance evaluation of parallel FFT algorithms
Fast Fourier Transform (FFT) algorithms are widely used in many areas of science and engineering. Some of the most widely known FFT algorithms are Radix-2 algorithm, Radix-4 algorithm, Split Radix algorithm, Fast Hartley transformation based algorithm and Quick Fourier transform. In this paper, the first three algorithms listed are implemented in the sequential and MPI (message passing interfac...
Efficient Parallel Scan Algorithms for GPUs
Scan and segmented scan algorithms are crucial building blocks for a great many data-parallel algorithms. Segmented scan and related primitives also provide the necessary support for the flattening transform, which allows for nested data-parallel programs to be compiled into flat data-parallel languages. In this paper, we describe the design of efficient scan and segmented scan parallel primiti...
Parallel Prefix (Scan) Algorithms for MPI
We describe and experimentally compare three theoretically well-known algorithms for the parallel prefix (or scan, in MPI terms) operation, and give a presumably novel, doubly-pipelined implementation of the in-order binary tree parallel prefix algorithm. Bidirectional interconnects can benefit from this implementation. We present results from a 32-node AMD Cluster with Myrinet 2000 and a 72-no...
Journal
Journal title: Proceedings of the ACM on Programming Languages
Year: 2017
ISSN: 2475-1421
DOI: 10.1145/3110251